88 research outputs found
Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices
Convolutional Neural Networks (CNNs) have revolutionized research in
computer vision, due to their ability to capture complex patterns, resulting in
high inference accuracies. However, the increasingly complex nature of these
neural networks means that they are particularly suited to server computers
with powerful GPUs. We envision that deep learning applications will
eventually be widely deployed on mobile devices, e.g., smartphones,
self-driving cars, and drones. Therefore, in this paper, we aim to understand
the resource requirements (time, memory) of CNNs on mobile devices. First, by
deploying several popular CNNs on mobile CPUs and GPUs, we measure and analyze
the performance and resource usage of every layer of the CNNs. Our findings
point to potential ways of optimizing performance on mobile devices.
Second, we model the resource requirements of the different CNN computations.
Finally, based on the measurement, profiling, and modeling, we build and
evaluate our modeling tool, Augur, which takes a CNN configuration (descriptor)
as input and estimates the compute time and resource usage of the CNN, to
give insight into whether, and how efficiently, a CNN can run on a given
mobile platform. In doing so, Augur tackles several challenges: (i) how to
overcome profiling and measurement overhead; (ii) how to capture the variance
across mobile platforms with different processors, memory, and cache sizes;
and (iii) how to account for the variance in the number, type, and size of
layers across different CNN configurations.
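The layer-wise cost modeling described above can be sketched roughly as follows. The function names and formulas below are illustrative assumptions (simple MAC-count and working-set estimates), not Augur's actual model:

```python
# Hypothetical sketch of layer-wise cost modeling in the spirit of Augur:
# estimate compute (multiply-accumulates) and memory from a conv layer's
# configuration alone. These formulas are illustrative, not the paper's.

def conv_macs(h, w, c_in, c_out, k, stride=1):
    """Multiply-accumulate count for one conv layer on an h x w x c_in input."""
    out_h, out_w = h // stride, w // stride
    return out_h * out_w * c_out * (k * k * c_in)

def conv_memory_bytes(h, w, c_in, c_out, k, bytes_per_val=4):
    """Rough working-set size: weights plus input/output activations (fp32)."""
    weights = k * k * c_in * c_out
    activations = h * w * c_in + h * w * c_out
    return (weights + activations) * bytes_per_val

# Example: a VGG-like first conv layer (3x3, 3 -> 64 channels) on a 224x224 image.
macs = conv_macs(224, 224, 3, 64, 3)
mem = conv_memory_bytes(224, 224, 3, 64, 3)
print(macs, mem)
```

A tool in this vein would evaluate such per-layer estimates for every layer in the CNN descriptor, then calibrate them against measured timings on each target device.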
DFT-Spread Spectrally Overlapped Hybrid OFDM-Digital Filter Multiple Access IMDD PONs
A novel transmission technique—namely, a DFT-spread spectrally overlapped hybrid OFDM–digital filter multiple access (DFMA) PON based on intensity modulation and direct detection (IMDD)—is proposed by employing the discrete Fourier transform (DFT)-spread technique in each optical network unit (ONU) and the optical line terminal (OLT). Detailed numerical simulations are carried out to identify optimal ONU transceiver parameters and explore the maximum achievable upstream transmission performance of the IMDD PON systems. The results show that the DFT-spread technique in the proposed PON is effective in enhancing the upstream transmission performance to its maximum potential, whilst still maintaining all of the salient features associated with previously reported PONs. Compared with previously reported PONs excluding DFT-spread, a significant peak-to-average power ratio (PAPR) reduction of over 2 dB is achieved, leading to a 1 dB reduction in the optimal signal clipping ratio (CR). As a direct consequence of the PAPR reduction, the proposed PON has excellent tolerance to reduced digital-to-analogue converter/analogue-to-digital converter (DAC/ADC) bit resolution, and can therefore ensure the utilization of a minimum DAC/ADC resolution of only 6 bits at the forward error correction (FEC) limit (1 × 10−3). In addition, the proposed PON can improve the upstream power budget by >1.4 dB and increase the aggregate upstream signal transmission rate by up to 10% without degrading nonlinearity tolerances.
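The PAPR benefit of DFT-spreading can be illustrated with a minimal textbook sketch in NumPy (plain DFT-spread OFDM, not the paper's full DFMA PON transceiver; all parameter values are illustrative):

```python
import numpy as np

# Illustrative sketch: DFT-precoding the data before the OFDM IFFT lowers
# the peak-to-average power ratio of the transmitted waveform.

rng = np.random.default_rng(0)

def papr_db(x):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

n = 256
qpsk = (rng.choice([-1, 1], n) + 1j * rng.choice([-1, 1], n)) / np.sqrt(2)

plain = np.fft.ifft(qpsk)               # conventional OFDM modulation
spread = np.fft.ifft(np.fft.fft(qpsk))  # DFT-spread: precode before the IFFT

# With a full-size (M = N) DFT, spreading degenerates to a single-carrier
# signal with constant-modulus QPSK, i.e. 0 dB PAPR; practical systems map
# an M-point DFT onto N > M subcarriers and still cut PAPR substantially.
print(papr_db(plain), papr_db(spread))
```

This is the mechanism behind the >2 dB PAPR reduction the abstract reports, which in turn relaxes the clipping ratio and DAC/ADC resolution requirements.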
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
While dynamic Neural Radiance Fields (NeRF) have shown success in
high-fidelity 3D modeling of talking portraits, their slow training and
inference speeds severely limit their practical use. In this paper, we propose an
efficient NeRF-based framework that enables real-time synthesizing of talking
portraits and faster convergence by leveraging the recent success of grid-based
NeRF. Our key insight is to decompose the inherently high-dimensional talking
portrait representation into three low-dimensional feature grids. Specifically,
a Decomposed Audio-spatial Encoding Module models the dynamic head with a 3D
spatial grid and a 2D audio grid. The torso is handled with another 2D grid in
a lightweight Pseudo-3D Deformable Module. Both modules focus on efficiency
under the premise of good rendering quality. Extensive experiments demonstrate
that our method can generate realistic, audio-lip-synchronized talking
portrait videos, while also being highly efficient compared to previous
methods. Comment: Project page: https://me.kiui.moe/radnerf
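The grid-decomposition idea can be sketched as follows. The grid resolutions, feature sizes, and nearest-neighbor lookup below are simplifying assumptions for illustration, not the paper's actual architecture:

```python
import numpy as np

# Hypothetical sketch of the decomposed-grid idea: rather than one huge
# high-dimensional feature grid over (space x audio), keep a 3D spatial grid
# and a separate 2D audio grid, and fuse the looked-up features.

rng = np.random.default_rng(0)
R, F = 16, 4                                  # grid resolution, features per cell
spatial_grid = rng.normal(size=(R, R, R, F))  # 3D spatial feature grid
audio_grid = rng.normal(size=(R, R, F))       # 2D audio feature grid

def lookup(grid, coords):
    """Nearest-cell lookup; grid-based NeRFs interpolate (e.g. trilinearly)."""
    idx = tuple(np.clip((np.asarray(coords) * R).astype(int), 0, R - 1))
    return grid[idx]

x = lookup(spatial_grid, (0.2, 0.5, 0.7))  # feature for a 3D query point
a = lookup(audio_grid, (0.3, 0.8))         # feature for a 2D audio coordinate
fused = np.concatenate([x, a])             # decoded by a small MLP downstream

# Storage: a joint 5D grid would need R**5 * F cells, versus only
# R**3 * F + R**2 * F for the two decomposed grids.
print(fused.shape)
```

The payoff is the storage and lookup cost: the two low-dimensional grids together hold orders of magnitude fewer cells than a single joint grid at the same resolution, which is what makes real-time rendering feasible.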